Decision errors: False Negatives

Elizabeth King
Kevin Middleton

Decision errors

Reject H0 Fail to reject H0
H0 is true Type I error Correct
H0 is false Correct Type II error

False positive (Type I error):

  • You decide there is an effect when in reality there is not
    • P is small by random chance, given that \(\alpha\) is chosen ahead of the test

False negative (Type II error) probability depends on:

  • You decide there is no effect when in reality there is
    • Depends on the value of \(\alpha\) & how “wrong” H0 is
    • Random chance leads to the estimated effect being smaller than it is in reality

Uncertainty and Decisions

  • All decisions should be accompanied by estimates of your uncertainty in your decision
    • Intervals
    • Decision error rates
      • e.g. false positive rate, false discovery rate, false negative rate

Power

  • Given a true effect, the probability that a random sample will lead to a rejection of H0
    • The proportion of times you DO NOT make a Type II error
  • Dependent on how different the truth is from the null hypothesis
  • Inversely related to type II errors
    • High power \(\rightarrow\) low false negative rate
    • Low power \(\rightarrow\) high false negative rate

Power depends on effect size

Effect size: The magnitude of the deviation from H0.

If we can estimate effect size before we do the study, we can estimate the power.

  • Use previous information
    • Your own pilot studies
    • Other similar studies
  • Determine how big a difference we want to be able to detect
    • How small of a difference is not biologically meaningful?

Monte Carlo approaches to estimating the false negative rate

  1. Simulate data mimicking your expected dataset many times
  2. Do your planned analysis on each simulated dataset
  3. Ask how often you detect an effect when one is present

General Considerations

  • What distribution is appropriate for sampling?
  • What effect sizes are reasonable to consider?
  • What are the sources and magnitudes of variation?

Power analysis via simulation

Simulate data from a exponential process:

\[\mbox{Femur Length} = a \mbox{Mass}^b\]

\[\log \mbox{Femur Length} = \log a + b \log \mbox{Mass}\]

What is the power to detect deviations from isometry?

  • Simulate across a range of n
  • Use a range of slopes from 1/3 - 0.2 to 1/3 + 0.2

Power analysis via simulation

nsims <- 1e4
alpha <- 0.05
ns <- c(5, 10, 25, 50, 100, 200, 400)
b_null <- 1/3
b_devs <- seq(-0.2, 0.2, length.out = 100)

# All combinations of ns and b_devs
pwr_reg <- crossing(ns, b_devs)
names(pwr_reg) <- c("n", "b_dev")
pwr_reg$Power <- NA

Power analysis via simulation

set.seed(912)
# Iterate through the rows of `pwr_reg`
for (i in 1:nrow(pwr_reg)) {
  tic <- Sys.time()
  message(i, " of ", nrow(pwr_reg))
  n <- pwr_reg$n[i]
  b_dev <- pwr_reg$b_dev[i]
  sig <- logical(nsims)
  
  for (j in 1:nsims) {
    log_Mass <- log(runif(n, 1, 1e3))
    log_a <- rnorm(n, 1.31, 0.15)
    log_Fem_Len <- log_a + (b_null + b_dev) * log_Mass
    fm <- sma(log_Fem_Len ~ log_Mass, slope.test = b_null, method = "OLS")
    sig[j] <- fm$slopetest[[1]]$p < alpha
  }
  pwr_reg$Power[i] <- mean(sig)
  save(pwr_reg, file = "Data/pwr_reg_SMA.Rda")
  message(Sys.time() - tic)
}
  • Mass uniformly distributed from 1 - 1000
  • \(a\) normally distributed with a mean of 1.31
  • Calculate log_Fem_Len

Power analysis via simulation

  • 7,000,000 regressions
  • ~8 hours later…

Power analysis via simulation

General Considerations

  • What distribution is appropriate for sampling?
  • What effect sizes are reasonable to consider?
  • What are the sources and magnitudes of variation?

What distribution is appropriate for sampling?

Examples:

  • Many phenotypes: Normal distribution
    • Bounded by values?
  • Presence/absence: Binomial
  • RNA Sequencing reads: Negative binomial

What effect sizes are reasonable to consider?

  • Biologically realistic differences (Normal distribution)
  • Genetic effects: Exponential distribution
  • Presence/absence: Range from 0 - 1
  • RNA Sequencing reads?

What are the sources and magnitudes of variation?

  • Genetic effects:
    • Genetic background
    • Environmental variation
    • Measurement error
  • Presence/absence
    • Sampling error
    • Observation errors
  • RNA Sequencing reads
    • Batch effects
    • Coverage

Power in multi-parent genetic mapping populations (MPPs)

Power in MPPs

eff <- 0.05
envs <- rnorm(length(genos),0,sqrt(((1/eff)-1)*var(genos)))
phenos <- genos + envs

print(cbind(genos, phenos))
      genos     phenos
 [1,]     0 -0.1052698
 [2,]     0  0.6067555
 [3,]     1  1.9353482
 [4,]     1 -1.0639235
 [5,]     0  5.3084234
 [6,]     0  0.6366667
 [7,]     1 -0.6613659
 [8,]     0  1.5492566
 [9,]     1  0.6933317
[10,]     0  1.1229054
print(c(mean(phenos[genos==0]),mean(phenos[genos==1])))
[1] 1.5197897 0.2258476

Power in MPPs

Power in MPPs

Power for craniofacial growth

Power for craniofacial growth

Power for craniofacial growth

Growth function

\[ length(age) = \frac{a_1}{1 + \exp{(-b_1 (age - c_1))}} + \frac{a_2}{1 + \exp{(-b_2 (age - c_2))}}\]

  • \(a_1\) and \(a_2\) are lengths in mm
  • \(b_1\) and \(b_2\) are rates in mm / year
  • \(c_1\) and \(c_2\) are ages in years

Growth function

Power to detect differences in parameters

\[ length(age) = \frac{a_1 + \Delta a_1}{1 + \exp{((-b_1 + \Delta b_1) (age - c_1 + \Delta c_1))}} + \\ \frac{a_2 + \Delta a_2}{1 + \exp{((-b_2 + \Delta b_2) (age - c_2 + \Delta c_2))}}\]

Power to detect differences in parameters (3,400 models)

##    a1_delta a2_delta b1_delta b2_delta c1_delta c2_delta     n median_elpd_diff
##  1        0        0     0        0         0        0.5   200             7.32
##  2        0        0     0        0         0        1     200            30.7 
##  3        0        0     0        0         0        2     200           110.  
##  4        0        0     0        0         0.5      0     200            11.9 
##  5        0        0     0        0         1        0     200            50.1 
##  6        0        0     0        0         2        0     200           211.  
##  7        0        0     0        0.02      0        0     200             2.94
##  8        0        0     0        0.05      0        0     200             2.78
##  9        0        0     0        0.1       0        0     200             2.44
## 10        0        0     0.02     0         0        0     200      3.74e- 2
## 11        0        0     0.05     0         0        0     200      6.68e- 8
## 12        0        1     0        0         0        0     200      9.61e- 2
## 13        0        2     0        0         0        0     200      1.14e- 4
## 14        0        3     0        0         0        0     200      2.23e- 9
## 15        1        0     0        0         0        0     200      3.09e- 1
## 16        2        0     0        0         0        0     200      2.34e- 1
## 17        3        0     0        0         0        0     200      2.23e- 1

Power to detect differences in parameters

nonzero |> 
  filter(a2_delta != 0) |> 
  group_by(a2_delta) |> 
  summarise(power = mean(if_else(a2_Q2.5 > 0 | a2_Q97.5 < 0, 1, 0)),
            .groups = "drop") |> 
  as.data.frame()
##   a2_delta power
## 1        1 0.000
## 2        2 0.300
## 3        3 0.985
##   b1_delta power
## 1     0.02  0.23
## 2     0.05  0.80
##   c2_delta power
## 1      0.5 0.890
## 2      1.0 0.925
## 3      2.0 0.935

When to use Monte Carlo for estimating false negative rates?